Scaling in Ordered and Critical Random Boolean Networks 
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Random Boolean networks, originally invented as models of genetic regulatory networks, are 
simple models for a broad class of complex systems that show rich dynamical structures. From 
a biological perspective, the most interesting networks lie at or near a critical point in parameter 
space that divides "ordered" from "chaotic" attractor dynamics. In the "ordered" regime, we 
show rigorously that the average number of relevant nodes (the ones that determine the attractor 
dynamics) remains constant with increasing system size A*'. For critical networks, our analysis and 
numerical results show that the number of relevant nodes scales like N^^^ . Numerical experiments 
also show that the median number of attractors in critical networks grows faster than linearly with 
N . The calculations explain why the correct asymptotic scaling is observed only for very large A*'. 
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A random Boolean network (RBN) is a collection of N 
binary logic gates, or nodes, wired together in a random 
fashion, with each node implementing a randomly cho- 
sen logical function of its inputs. RBNs are paradigms 
for systems in which excitatory and inhibitory interac- 
tions occur among a large set of interacting elements. 
One example of great current interest is the regulatory 
network that governs gene expression in a cell. It has 
been suggested that the distinct dynamical attractors of 
a single RBN be interpreted as distinct cell types carrying 
the same genetic information. Q Surprisingly, RBN at- 
tractors can exhibit many features of biological cells, in- 
cluding stability against random external perturbations, 
qualitative change in response to special perturbations, 
and plausible scaling laws for numbers of attractors and 
attractor cycle lengths. |^ It therefore appears impor- 
tant to understand the behavior of RBNs as a first step 
in determining relevant global properties that might be 
probed in real gene expression experiments. 

Even very simply constructed RBNs with determin- 
istic updating rules can exhibit a rich set of dynamical 
behaviors. We focus here on the case in which each node 
has the same number of inputs, K. Fig. [l] shows an ex- 
ample with K ^ 2. Each node i implements a truth 
function Fi (e.g. AND, XOR, etc. for K=2) that is cho- 
sen at random from a weighted distribution of all of the 
2^ possible truth functions on K binary inputs. On 
each discrete time step, the outputs are updated syn- 
chronously. Since the number of states of the system is 
finite (equal to 2^) and the system is deterministic, for 
any initial condition the network must eventually settle 
into a periodic attractor. We are interested in the be- 
havior of large N networks. How many attractors do 
they have? How many nodes typically participate in the 
attractor dynamics? 

It is well known that tuning the probabilities of differ- 
ent F's can produce an order-chaos transition. (See ||] 
for a thorough review.) In the ordered regime, almost all 
nodes are frozen and attractor cycles are short. In the 
chaotic regime, on the other hand, the number of fluctu- 
ating nodes is a finite fraction of N and attractor cycles 




FIG. 1; A is: = 2, = 10 network. Each node has two 
inputs, but the number of outputs (drawn from its center) 
can vary. See text for details. 



can be quite long. For large N, there is a narrow critical 
regime between these phases. The networks of greatest 
interest for biological systems are conjectured to lie near 
the critical regime, on the ordered side. |^ 

In this Letter, we present numerical and analytic re- 
sults that clarify the dynamical structure of RBNs in the 
ordered and critical regimes, revealing several surprising 
features: (1) for RBNs in the ordered regime the average 
number of relevant nodes (defined below) remains finite 
for N ^ oo and they are organized into trivial loops; (2) 
in critical RBNs, the average number of fluctuating nodes 
grows like N'^^^; (3) the system sizes required to observe 
the asymptotic regime can be extremely large, especially 
for K = 2; and (4) the median number of attractors in 
critical RBNs grows faster than linearly with N, at least 
for TV up to 1200. Both (2) and (4) contradict previous 
claims ([Q and [^, respectively) which we believe to have 
been based on studies that did not consider sufficiently 
large N. (4) also supersedes an old claim by one of us 
that the median number of attractors grows like \/N in 
critical networks. 

The concepts of "relevant" nodes and "canalizing 
inputs" [0 are essential to our analysis. In any given 
network, there may be nodes whose outputs are frozen 
at the same value on every attractor. Such nodes serve 
only to fix inputs to other nodes and are otherwise "irrel- 
evant" . There may also be nodes whose outputs go only 
to irrelevant nodes. These are also classified as irrelevant. 
Though they may fluctuate, they act merely as slaves to 



the nodes that determine the attractor cycle. 

Almost all the irrelevant nodes can be found as fol- 
lows. 1^ One first identifies "fixed" nodes whose outputs 
are entirely independent of their inputs. One then uses 
an iterative procedure to identify nodes that must be 
frozen because their outputs depend only on inputs from 
other frozen nodes. We call all frozen nodes identified 
this way "clamped" , let s denote their number, and de- 
fine u = N — s. A similar iterative procedure is then 
used to remove (or "prune" ) nodes with no relevant out- 
puts. For example, in Fig. |l|, even if nodes 1 — 5 are 
all undamped, nodes 6 — 8 can be pruned. For the pur- 
poses of this paper we designate all nodes that are neither 
clamped nor pruned via the described procedure as "rele- 
vant" , and let r denote their number. (Additional nodes 
may be frozen due to correlations between two or more 
undamped inputs |^ , so r is greater than or equal to the 
number of truly relevant nodes.) 

A canalizing input to a given node is one that can be 
set to a value that determines the output, independent of 
the other inputs. (For example, if either input to an OR 
function is set to 1, the output is determined.) Following 
1^, we define a set of parameters pk as follows. For a 
randomly selected F, fix a randomly selected K~k inputs 
at arbitrarily chosen values, pk is the probability that 
those input values are collectively canalizing; i.e., that 
F is independent of the k remaining inputs. Note that 
Po — \ (fixing all the inputs certainly determines the 
output) and px is the probability that a node is fixed 
(i.e., that its output is independent of all of its inputs). 

The order-chaos transition can be observed by tuning 
the PkS. One simple way to do this is to assign to each 
F a probability that depends only on the number of 1 
in the output column of its truth table. For p g [0, 1], 
we let the probability that a node has truth function F 
be p^^q^^''^, where q ^ {1 — p). It is straightforward 
to check that this parametrization corresponds to pk = 
p^ + q'^ . With this weighting of the F's, the transition 
occurs at 2Kpq — 1, with 2Kpq < 1 corresponding to 
the ordered regime |^, For our numerical studies, 
we set K = 2 and vary p. 

We have carried out two types of numerical experi- 
ment on K — 2 networks. First, for 1000 networks at 
each of five p values, we determine u and r. Fig. |^(a) 
shows the measured (r) for iV up to 3 x 10^. It ap- 
pears that (r) approaches a constant at large N for p in 
the ordered regime. At the critical value p — 1/2, the 
data are inconclusive, showing significant curvature on 
the log-log plot out to the largest N we have studied. 
They are consistent, however, with an asymptotic scal- 
ing law (r) ^ N^/^. Panel (b) shows the (u) for p — 1/2, 
indicating a clearer power-law scaling (u) ~ N'^/^ . 

Second, for at least 1000 networks at each p, we at- 
tempt to measure the number of attractors, on the set 
of relevant nodes. In some networks, however, A is pro- 
hibitively large, making measurements of {A) difficult. It 




FIG. 2: Average numbers of relevant nodes in K = 2 networks 
with truth functions specified by p. Vertical dashed lines in- 
dicate theoretical predictions for the crossover from critical to 
ordered behavior at each p. Horizontal segments at the right 
indicate theoretical calculations of the asymptotic values of 
(r) for each p. Diagonal lines are guides to the eye, showing 
slopes of 1/2 and 1/3. (b) Average numbers of undamped 
nodes in K = 2 critical (p = 1/2) networks. The dashed line 
has slope 2/3. 



is much easier to measure the median, A^ since one need 
not continue to count attractors in a given network after 
the count has exceeded the median. To count attractors, 
we repeatedly choose random initial conditions and iden- 
tify the attractor reached. If 1000 consecutive attempts 
yield no new attractor, we record the number of attrac- 
tors found and move on to another network. This gives 
a lower bound on A for each network, and hence a lower 
bound on A. 

Fig. 1^ shows the results for N up to 1200. We note 
that in where measurements of {A) were sought, it 
was not possible to consider nets larger than TV — 144. 
From Fig. however, it is clear that any extrapolation 
based on data for N smaller than about 500 at the crit- 
ical point is suspect. We see a faster than linear rise in 
the median A above N ~ 500, which almost certainly 
implies a faster than linear rise in {A) as well. Moreover, 
Fig. ||(a) strongly suggests that one must study iV > 10^ 
to observe the true asymptotic behavior! 

A simple calculation provides a rigorous upper bound 
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FIG. 3: Median numbers of attractors for K = 2 networks 
with truth functions specified by p. Error bars indicate one 
standard deviation. Note the upward curvature for the critical 
value p = 1/2. 
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on (it) and explains why asymptotic scaling sets in only 
for very large N. A technical note: in the networks stud- 
ied numerically, gates were not permitted to have two 
inputs from the same node. The analysis below ignores 
this constraint, which yields 0{1/N) effects. In all cases, 
self inputs are allowed. 

Let P{u) be the probability that a randomly selected 
network has u undamped nodes. To compute P{u), we 
should count the networks with u undamped nodes and 
divide by the total number of networks T ~ [tN^]^ , 
where r = 2^ is the number of possible truth functions 
for each node. We begin by considering a quantity P'{u) 
that is guaranteed to exceed P{u): 



P'(u) = l C{N,u) 



K 



K 



Lk=0 



,(1) 



where = 1 — pk and C{m, n) is the number of combi- 
nations of n objects drawn from m. The factor C'{N, u) 
is the number of ways of having u undamped nodes. The 
first sum counts the ways that a node can be clamped, 
weighted by the probability of its truth function: the 
node in question can have up to k unstable inputs as 
long as the other K — k are collectively canalizing, which 
occurs with probability pk- Simlarly, the second sum 
counts the ways a node can be undamped. 

P'{u) overcounts the probability of having u un- 
damped nodes because it ignores the constraint that all 
clamped nodes must be traceable through a sequence of 
inputs back to a fixed node. That is, the first sum over- 
counts the number of ways that s clamped nodes can 
be wired, as it includes graphs in which a subset of the 
s nodes collectively clamp each other without any con- 
nection to a fixed node. For example, suppose that in 
Fig. |l| nodes 1 and 4 are fixed, node 9 implements OR, 
and the output of 10 is simply equal to its input from 
9. Nodes 2, 3, and 5 — 8, are clearly clamped through 
inputs that can be traced back to 1 and 4. The sum 
then counts this network as a possible arrangement of 
the clamped nodes for u = because it is self-consistent 
to assume that both 9 and 10 are clamped. (Note that 
after a few time steps 9 and 10 will either both be stuck 
on or both on 1.) However the iterative procedure 
defined above would (correctly) not identify 9 and 10 
as clamped, so its inclusion in -P'(O) constitutes over- 
counting. Note that the same network is also (properly) 
counted in P'{2). Thus P'{u) > P{u) for all m, implying 
that Y.u9{u)P'{u) > Y.u9{u)P{u) for any g{u) > 0. 

If u is small compared to N, a useful approximation 
to Eq. (P can be obtained. Using s = N ~ u and using 
Stirling's formula to simplify the binomial coefficients, to- 
gether with the identity (l-t-x/n)" = exp[x{l— (x/n) /2 + 



(x/n)V3 + ...)], we find 
P'{u) 



^27ru(l - u/N) 



cxp 



where 



71 = 
92 = 

h = 



Q-1-lnQ, 

(l-Q)(l + g-2/v + 2Q2)/2, 



-O2 -P3-Q3 + {K - Q)P2 + -Ql 



\K-l + 2{K~Qf]/(i, 



(2) 

(3) 
(4) 

(5) 



with Q = Kqi, Qn = C{K,n)qn/Q, and P„ = 
C{K, n)pn- The calculation involves only straightforward 
algebra and the assumption that terms of order /N^ 
can be neglected in the exponent, which is justified when- 
ever 01 , 62, and 03 are all positive and at least one of them 
is nonzero. Under such circumstances, and in the limit 
of large iV, P'{u) is exponentially strongly suppressed for 
It's larger than order N^^^, whereas the neglected terms 
would only become relevant for u's of order N^^^. 

Q is equivalent to the order parameter defined by Fly- 
vbjerg in 1^, where it was also argued that Q = 1 marks 
the critical boundary. Eq. (H) proves that for all Q < 1 
the P{u) decays exponentially with a decay length inde- 
pendent of A^. It also strongly suggests that this is not 
the case at Q = 1, though the computation only gives 
an upper bound. The fact that 61 and 62 both vanish at 
Q = 1 is a surprising result that affects the scaling of (r) 
in critical networks, as we shall see below. 

In the ordered regime Q < 1, we have 9i > 0, so higher 
order terms are irrelevant at sufficiently large N and 
P'(u) becomes independent of N. Thus (u) for asymp- 
totically large N is bounded above by a constant. For the 
case where pk is determined simply by the one parame- 
ter p, we have qi — 2p(l — p) and the critical p satisfies 
2Kpc{l —Pc) = 1, consistent with previous studies of the 
effect of p. For p near pc we get 



8ip-Pc)' 

2K{K-2){p-p,f 



ior K = 2 
for K>2. 



For K = 2, then, 9i is quite small even for p relatively far 
frompc = 1/2. For example, p = 0.55 gives Oi ~ 5x 10^^. 
The asymptotic behavior of the system should only be 
apparent when u is bigger than \/9i and hence when 
N is substantially larger than that. The vertical dashed 
lines in Fig. ||(a) are drawn at A'' = 3/0i, which is seen 
to give an accurate indication of where effects associated 
with the ordered regime become important. 

Moreover, for very large A^, we may obtain an accu- 
rate approximation to P{u) simply by normalizing P'{u), 
since the overcounting factor relating the two approaches 
some nonzero constant at u = 0. This allows us to com- 
pute (r) (not just an upper bound on it) as follows. Con- 
sider the undamped nodes and the links between them. 



4 



These form a random graph subject only to the constraint 
that each node has between 1 and K inputs. But from 
the second sum in Eq. we know the relative proba- 
bilities that an undamped node will have k undamped 
inputs. Specializing to K — 2 for simplicity, the average 
number of inputs per node for fixed u is 



{z) = {Qsu + 2q2U^) I {Qsu + q2U^) 

i' 

QN ' \Q Q2 



1 + ^4^ 



Now in the ordered regime, where the exponential cutoff 
in P{u) is independent of A^, we have / duP{u)u/N ^0 
for all M as iV ^ oo and hence (z) 1. Thus the net- 
work of undamped nodes at large N is just a randomly 
wired K — 1 network with no fixed nodes. Exact com- 
binatorics for if = 1 random graphs show that the ex- 
pected number of loops of size n in a network of size u is 
Lu{n) = u\/[n(u — n)\ {u — 1)"]. Since all nodes not 
in loops can be pruned, we obtain 



N 



(7) 



The horizontal dashed lines on the far right in Fig. ||(a) 
mark this prediction, which agrees well with the data. 
For the critical case Q = 1, we have 



P'{u) ~ [2ttu{1-u/N)]-^/'^ 



exp 



(8) 



which implies that (u) cannot grow faster than N^^^. 
This contradicts a previous argument suggesting (u) ~ 
N^/"^ 0- Note however, that the corrections from terms 
of order u^/N'^ cannot be neglected unless {N'^^^)^ /N^ = 
7V-1/9 << 1^ suggesting that the critical scaling emerges 
only for iV > 10^, as confirmed by Fig. ||. 

Naive normalization of P'{u) to get P{u) yields (u) ~ 
N'^/^ and Fig. ||(b) indicates that this is correct. Given 
the scaling law and the fact that paths connecting rel- 
evant nodes cannot lead to dead ends, the theory of 
directed random graphs and Eq. (H) can be used 
to show that (r) ~ N^^^, consistent with the numeri- 
cal trend seen in Fig. ^(a). Unfortunately, naive nor- 
malization does not yield an accurate prediction for the 
full form of P{u) because the u dependence of the factor 
f{u) = P{u)/P'{u) becomes important. Determination 
of the precise form of f{u) is beyond the scope of this 
work. 

We now address the question of the number of at- 
tractors. For the case Q < 1 and very large N, the 
relevant nodes form trivial loops with only two possible 
truth functions: F{a) — a ("identity") and F{a) = 1 — cr 
("not"). For n prime, a loop of size n has either (2" — 2)/n 
or (2" — 2)/2n attractors, depending on whether the 
number of nots in the loop is even or odd. For n not 
prime, the number of attractors is slightly larger. ||ic|] 



Now L„(n)(2" — 2)/n is extremely sharply peaked very 
close to n = u/2. {A)u is thus dominated by rare net- 
works that contain a relevant loop of size u/2, which gives 
{A)u ~ 2'^"/^, with c a constant of order unity. Naive av- 
eraging over P{u) gives a divergent answer; a consistent 
calculation would require inclusion of the terms in P{u) 
that cause rapid decay for u of order N. 

The median number of attractors. A, can be estimated 
as 2^/b, where b is the size of the biggest loop of relevant 
nodes in a given network. For fixed it, the probability 
of occurence of b is Pu{b) = ■^«(&) 11^=6+1 [1 - Lu{n)]. 

To find b, we numerically solve X]&=o ^uLo P{'^)Puib) — 
1/2. The results for the networks studied in Fig. ^ are 
A = 3, 5, and 14 for p ~ 0.7, 0.65, and 0.6, respectively. 
This result and its rough agreement with the data (see 
Fig. H) confirms that A approaches a constant at large N 
for any sub-critical p |l2| and illustrates the qualitative 
distinction between {A) and A. 

In closing, we note that the classification of nodes as 
relevant is robust across a large class of updating rules. 
The results on (r) reported here apply to asynchronous 
models and to models in which there is a stochastic time 
delay in the updating of each node, as long as the update, 
whenever it occurs, is always accurate. Finally, we note 
that the numbers of genes in real cells are on the order 
of 10*, which our results show may be too small to ex- 
hibit asymptotic large N behavior. Thus, networks with 
canalization parameters that nominally place them in the 
ordered regime will exhibit features of critical networks, 
which may be important for biological function. 
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